
The use of the Kluyveromyces marxianus inulinase gene
promoter for protein production.

Technical Field 

The subject invention lies in the field of DNA technology.
In particular, the invention covers a nucleic acid sequence
derivable from a yeast and comprising at least a regulatory
region derivable from a gene encoding a polypeptide having
inulinase activity. The invention is also directed at an
expression vector comprising the aforementioned nucleic
acid sequence and is furthermore directed at the use of
said nucleic acid sequence or expression vector for
producing a desired expression product.

Background

Yeast strains of the genus Kluyveromyces have been used for
the production of enzymes for many years, and the growth of
these strains has been extensively studied. Kluyveromyces
marxianus var. marxianus strains (hereinafter also called
Kluyveromyces marxianus or K. marxianus) are well known for
their ability to utilize a large variety of compounds as
carbon and energy sources for growth. Since these strains
are able to grow at high temperatures and exhibit high
growth rates, they are promising hosts for the industrial
production of heterologous proteins.

Among the substrates that support growth are
polysaccharides such as inulin, xylan and pectin, which are
degraded by extracellular enzymes. Inulinase (EC 3.2.1.7)
is an extracellular enzyme that enables the yeast to grow
on fructans such as inulin and sucrose. The enzyme occurs
in two forms, whereby part of the enzyme is secreted into
the culture fluid as a dimer and part is retained in the
cell wall as a tetramer. The relative amounts of the cell
wall and supernatant enzyme depend on cultivation
conditions, a situation similar to that of invertase
(EC 3.2.1.26) of Saccharomyces cerevisiae. The two enzymes
differ in substrate specificity for inulin and sucrose, a
fact normally expressed in the S/I ratio (Vandamme et al.,
1983).

The pKD1 plasmid (Falcone et al., 1986), originally found in
Kluyveromyces drosophilarum (now regarded as a variety of
Kluyveromyces lactis), belongs to the family of yeast double
stranded circular plasmids, and does not confer any evident
phenotype. Based on plasmid pKD1, several commercially
attractive expression systems for high level expression of
prochymosin (v/d Berg et al., 1991) and human serum albumin
(Fleer et al., 1991) have been developed for the yeast
Kluyveromyces lactis. As known from S. cerevisiae, a high
copy plasmid based expression system has the advantage of
supplying the host with a sufficient number of gene copies
to obtain high-level expression, while integration into the
genome in single or low copy number increases the mitotic
stability under non selective growth conditions.
To combine the benefits of both expression systems, the
concept of a multicopy integration system in the rDNA locus
of S. cerevisiae has already been successfully proven
(Verbakel, 1991). The potential of these constructs for
stable, multicopy integration into the genome has been
demonstrated for different organisms, genes and auxotrophic
markers (Lopes, 1990; Verbakel, 1991; Bergkamp et al.,
1991). Although progress has been made elsewhere in
stabilizing a plasmid borne expression system (Fleer et
al., 1991), the adaptation of multicopy integration to the
genome of K. marxianus is a more favourable option and is
currently under investigation. To this end the first leu2
mutant of strain CBS 6556 has been made, in this
specification named KMS1, in which the LEU2 gene is
inactivated through integration of a pPGK/Neomycin
resistance (Neo^R) cassette (Bergkamp et al., 1991).
In the course of transferring the multicopy
integration system from S. cerevisiae to K. marxianus,
a vector was first developed that is capable of
integrating into the genome of K. lactis by targeted
homologous recombination in the ribosomal DNA locus
(Bergkamp et al., 1992). Using this vector system, the
expression of an α-galactosidase gene from a fusion
construct containing the S. cerevisiae GAL7 promoter and the
SUC2 invertase signal sequence was obtained. With a maximum
number of integrated plasmids of about 15, a level of about
250 mg/l α-galactosidase was obtained, with a secretion
efficiency of about 95%. Compared to the ARS- or pKD1-
derived K. lactis vectors containing the fusion construct,
the multicopy integrants exhibit a considerably higher
stability under non-selective growth conditions.

However, in addition to the importance of a stable, high
copy number system, the strength and effectiveness of the
surrounding regulatory sequences seem to be crucial factors
for high level expression, at least in K. marxianus. This is
supported by results from the same group, where attempts to
use the S. cerevisiae GAL7 and PGK promoters for expression
in K. marxianus have, in contrast to their effects in K.
lactis, only led to an extremely low yield.
On the other hand, the homologous ORF1 promoter of killer
plasmid k1 and the LAC4 promoter have already been
successfully tested by different companies (Fleer et al.,
1991; v/d Berg et al., 1991).

Yet another difficulty is the proficient secretion of
recombinant protein by Kluyveromyces, especially when the
protein is expressed in large quantities. Even though some
heterologous secretion/signal sequences have been shown to
be functional in K. lactis, as for example the human serum
albumin prepro-sequence (Fleer et al., 1991), there is a
strong demand for an efficient homologous signal sequence,
especially from K. marxianus.

Description 

Since inulinase is known to be expressed in very high
concentrations under appropriate cultivation conditions in
K. marxianus, the present invention is directed in
particular at the cloning of regulatory regions, such as
the promoter sequence and the signal sequence of the
inulinase gene, as promising components for the development
of an expression system.

This invention therefore relates generally to a nucleic
acid sequence derivable from a yeast and comprising at
least a regulatory region derivable from a gene encoding a
polypeptide having inulinase activity. Said nucleic acid
sequence can be combined with nucleic acid sequences
encoding other homologous or heterologous genes to bring
these genes under the control of at least one strong
inulinase regulatory sequence.

"Nucleic acid sequence" as used herein refers to a 
polymeric form of nucleotides of any length, thus to both
single and double stranded deoxyribonucleic acid (DNA)
sequences, to ribonucleic acid (RNA) sequences, as well as
to modifications thereof. In principle a single stranded
nucleic acid DNA refers to the primary structure of the
molecule.

In general the term "polypeptide" refers to a molecular
chain of amino acids with a biological activity. It does
not refer to a specific length of the product, and the
product can if required be modified in vivo or in vitro.
This modification can for example take the form of
glycosylation, amidation, carboxylation or phosphorylation;
thus, inter alia, peptides, oligopeptides and proteins are
included. In this instance the polypeptide has inulinase
activity.

Yet another major aspect of the present invention is
related to the isolation, characterization and the use of
the signal sequence of a polypeptide having inulinase
activity, and parts thereof, for secretion of any
overexpressed product from yeast, in particular from
Kluyveromyces. A nucleic acid sequence according to the
invention can therefore optionally further comprise a
nucleic acid sequence encoding a secretory signal of
inulinase.

The invention further relates to a vector containing the
nucleic acid sequences as described and also relates to
micro-organisms containing said vectors or nucleic acid
sequences.

The invention is also directed at modified sequences of the
aforementioned nucleic acid sequences according to the
invention, said modified sequences also having regulatory
activity. The term "a modified sequence" covers nucleic
acid sequences having the regulatory activity equivalent to
or better than the nucleic acid sequence derivable from a
yeast and comprising at least a regulatory region derivable
from a gene encoding a polypeptide having inulinase
activity. Such an equivalent nucleic acid sequence can have
undergone substitution, deletion or insertion, or a
combination of the aforementioned, of one or more
nucleotides resulting in a modified nucleic acid sequence
without concomitant loss of regulatory activity occurring.
Such modified nucleic acid sequences fall within the scope
of the present invention. In particular modified sequences
capable of hybridizing with the non modified nucleic acid
sequence and still maintaining at least the regulatory
activity of the unmodified nucleic acid sequence fall within
the scope of the invention.

The term "a part of" covers a nucleic acid sequence being a
subsequence of the nucleic acid sequence derivable from a
yeast and comprising at least a regulatory region derivable
from a gene encoding a polypeptide having inulinase
activity. The term "a part of" also covers a subsequence
that is specific for the nucleic acid sequence derivable
from a yeast and comprising at least a regulatory region
derivable from a gene encoding a polypeptide having
inulinase activity, said subsequence having a length of at
least ten nucleotides and being capable of hybridizing to a
regulatory region of a yeast inulinase gene under stringent
conditions, said subsequence being suitable for use as a
probe or a primer. The invention is in fact also directed
at such use of a nucleic acid sequence according to the
invention.

secretory signal necessary for secreting a gene product
from a yeast. This will be preferred when intracellular
production of a desired expression product is not
sufficient and extracellular production of the desired
expression product is required. Secretory signals comprise
the prepro or pre sequence of the inulinase gene, for
example. A secretory signal derivable from the inulinase
gene of a Kluyveromyces yeast is particularly favoured. The
specific embodiment of the nucleic acid sequence according
to the invention will however depend on the goal that is to
be achieved upon using a sequence according to the
invention.

With the help of DNA oligonucleotides deduced from either
existing protein sequencing data of inulinase from K.
marxianus or newly obtained protein sequence analysis, for
example, a 290 bp DNA fragment has been generated by use of
the PCR technique, and the fragment has further been used
for the isolation of chromosomal DNA fragments containing
the whole inulinase gene of K. marxianus including the
regulatory regions, such as the promoter, the signal
sequence and the termination sequence. The invention is
therefore in particular directed at a nucleic acid sequence
derivable from a yeast and comprising at least a regulatory
region derivable from a gene encoding a polypeptide having
inulinase activity in any of the embodiments described
above, said nucleic acid sequence comprising at least a
part of the nucleic acid sequence of figure 5 or an
equivalent nucleic acid sequence. The term "equivalent
nucleic acid sequence" has the same meaning as given above
for "a modified nucleic acid sequence".

Yet another aspect of this invention relates to the
isolated nucleic acid fragment of K. marxianus containing
an open reading frame encoding 556 amino acids, with
nucleotides encoding a prepro-peptide sequence of 23 amino
acids at the amino terminus. The calculated molecular
weight of the corresponding gene product is 62.5 kDa, which
is in good agreement with the 64 kDa experimentally
determined for the corresponding polypeptide (Rouwenhorst
et al., 1990).
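
The order of magnitude of the calculated molecular weight can
be checked with a rough rule of thumb. The following is only a
sketch: the average residue mass of 110 Da is a generic
assumption that is not taken from this specification, and the
function name is hypothetical. It does not reproduce the exact
62.5 kDa figure, which depends on the actual amino acid
composition.

```python
# Rough mass estimate from chain length alone.
# ASSUMPTION: 110 Da is a generic average residue mass, not a value
# stated in the specification.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Return an approximate polypeptide mass in kDa."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    full_length = 556          # amino acids encoded by the open reading frame
    prepro_length = 23         # prepro-peptide at the amino terminus
    print(approx_mass_kda(full_length))
    print(approx_mass_kda(full_length - prepro_length))
```

With these assumptions the full-length estimate lies within a
few percent of the calculated and experimentally determined
values quoted above.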

A further aspect of the invention is directed at processes
for the construction of either episomal or integrating
expression vectors containing the described regulatory
sequence or sequences. In the given examples the expression
and secretion potential of the obtained INU promoter and
the INU signal sequences have been tested by constructing a
variety of new vectors for expression of a heterologous
α-galactosidase gene in Kluyveromyces. The resulting
constructs were tested in K. marxianus, variety marxianus.
Yet another aspect of this invention relates to a method of
transforming a Kluyveromyces strain capable of producing a
heterologous protein through fusion with the prepro- and
the pre-part of the homologous inulinase signal sequence.
High expression levels and nearly complete secretion were
obtained with all episomal plasmids that were constructed.
A strain transformed with a construct containing the whole
prepro-sequence secreted up to 150 mg/L enzyme when grown
in shake flasks, which is an approximately 100 fold increase
compared to the vectors containing non homologous S.
cerevisiae promoters and signal sequences.
In another embodiment of the present invention the PCR
technique is used in combination with class IIS
restriction enzymes, primarily to facilitate the functional
new recombination of the described DNA fragments.
As a typical example, BspMI, a member of the group of
IIS restriction endonucleases, cuts every DNA sequence 4 bp
5' of the specific recognition site "ACCTGC", thereby
generating 5' N4 protruding ends (reviewed by Szybalski et
al., 1991). The advantage of these enzymes, particularly in
combination with PCR, is the nearly complete independence
from a given sequence within modern molecular working
procedures. Introduction of the recognition sequence into
the non priming part of a primer used for PCR allows
subsequent generation of any desired end of the PCR
fragment.
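
The sequence independence of a class IIS enzyme can be
illustrated with a short sketch. It assumes the cut offsets
commonly quoted for BspMI, ACCTGC(4/8), that is a cut 4
nucleotides past the site on the strand shown and 8
nucleotides past it on the complementary strand; these offsets
and all names below are assumptions for illustration and are
not taken from this specification. Only sites in the forward
orientation are searched, and the test sequence is
hypothetical.

```python
SITE = "ACCTGC"
TOP_OFFSET = 4      # ASSUMPTION: top-strand cut, nt past the site
BOTTOM_OFFSET = 8   # ASSUMPTION: bottom-strand cut, nt past the site


def bspmi_overhangs(seq: str) -> list[tuple[int, str]]:
    """Return (site start, 4-nt overhang) for each forward-orientation site.

    The overhang is read on the given strand between the two cut
    positions; its bases are whatever follows the site, which is why the
    protruding end can be chosen freely when the site sits in a primer.
    """
    seq = seq.upper()
    found = []
    start = seq.find(SITE)
    while start != -1:
        site_end = start + len(SITE)
        top_cut = site_end + TOP_OFFSET
        bottom_cut = site_end + BOTTOM_OFFSET
        # Skip sites too close to the end to be cut on both strands.
        if bottom_cut <= len(seq):
            found.append((start, seq[top_cut:bottom_cut]))
        start = seq.find(SITE, start + 1)
    return found


if __name__ == "__main__":
    # Hypothetical primer-like sequence: site, 4-nt spacer, desired end.
    print(bspmi_overhangs("TTACCTGCAAAAGATCGG"))
```

Changing the four bases that follow the spacer changes the
protruding end without touching the recognition site.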

Furthermore, this invention relates to a process for
producing a desired expression product wherein a host cell
comprising a vector according to the invention is cultured
under conditions enabling the expression of the structural
gene and optionally the isolation of the desired expression
product.

The invention also relates to a method for producing RNA,
wherein a host cell comprising a recombinant nucleic acid
sequence according to the invention is cultured under
conditions enabling production of the RNA, whereby said
recombinant nucleic acid sequence further comprises a
regulatory region operably linked to DNA encoding a
specific RNA sequence not encoding a specific protein. Such
a process can for example be used to produce RNA itself as
the desired expression product, or to produce RNA that
influences the formation of at least one metabolite as the
desired expression product. The amount of metabolite can be
increased by using the regulatory region or regions
according to the invention (as described above in various
suitable embodiments) in combination with a nucleic acid
sequence encoding a protein that influences the formation
of the metabolite.

The production of anti-sense RNA that binds to sense RNA
encoding an expression product can be used for decreasing
the amount of said desired expression product, whereby the
latter can be a specific protein or a protein influencing
the formation of a metabolite.

A nucleic acid sequence according to the invention can
therefore also comprise an anti-sense nucleic acid sequence
in combination with one or more of the regulatory regions
according to the invention.

A process for producing RNA according to the invention can
be directed at the production of an RNA sequence that
functions as a flavouring component. The nucleic acid
sequence according to the invention is therefore not only
to be considered useful for overexpression of a
proteinaceous gene product but also for producing RNA.

Brief Description of the Figures 
Figure 1. 

Protein sequence analysis of two forms of inulinase from K.
marxianus after CNBr-digestion. Fragment 1 in figure 1
corresponds to Seq. ID. No. 1 and fragment 2 in figure 1
corresponds to Seq. ID. No. 2. The expected cleavage after
a Met residue is indicated by an arrow; small letters in
the sequence indicate very likely residues. Identification
is based on either homology with invertase (n) or strong
suspicion.

Figure 2. 

a) DNA oligonucleotides derived from the amino acid
sequence of the internal CNBr-fragments 1 and 2. Fragment 1
in figure 2a corresponds to Seq. ID. No. 3 and fragment 2
in figure 2a corresponds to Seq. ID. No. 4.

b) DNA probes from the mature N-terminus of secreted
inulinase. The number of nucleotides is given in brackets,
as well as the abbreviations used in the text (probe KLM
04 corresponds to Seq. ID. No. 5, probe KLM 05 corresponds
to Seq. ID. No. 6, probe KLM 08 corresponds to Seq. ID. No.
7 and probe KLM 09 corresponds to Seq. ID. No. 8). In cases
where mixed oligonucleotides are used during DNA synthesis,
the corresponding letters are given; the orientation of the
DNA oligonucleotides is mentioned.

Figure 3. 

Nucleotide sequence (Seq. ID. No. 15) of the 280
nucleotides long PCR fragment of the N-terminal coding
region of the inulinase gene in pTZ18R. The localization of
the two corresponding PCR primers, as well as their code, is
given. In the line "seq" the experimentally determined
amino acid sequence of inulinase from K. marxianus is
given. The deduced amino acid sequence (Seq. ID. No. 16) is
mentioned in the line below the DNA sequence; here, all
amino acids identical to amino acids of Saccharomyces
cerevisiae invertase are underlined.

Figure 4.

First restriction endonuclease cleavage map of the region
around the inulinase gene of K. marxianus. Restriction
sites were located through KpnI double digestions. Not all
restriction sites of the given restriction enzymes are
mentioned. The 0 kbp mark refers to the 5' end of the coding
sequence of inulinase. In the upper part of the figure
about 22 Kbp are mapped, while the lower part displays a
refinement of approximately 2.5 Kbp around the target
sequence.

Figure 5.

Nucleotide sequence (Seq. ID. No. 9) of the inulinase gene
(INU1) of Kluyveromyces marxianus. The TATAAA box,
transcription start sites, the putative MIG1 binding site
as well as the predicted recognition site for the signal
peptidase (G-V-S-A-↓-S-V-I) and the processing site for a
KEX2-like endoprotease (K-R-↓) are indicated. Numbering
starts with the ATG start codon; the deduced amino acid
sequence (Seq. ID. No. 10) is given in one letter code
below the coding part.

Figure 6.

Autoradiogram of the primer extension assay. Results are
shown for the primer extension assays in the presence of
[α-32P]dCTP with total RNA from repressed [1] and
derepressed grown cells [2]. The size of the obtained
fragments is indicated.

A: assay with primer p21T;
B: assay with primer p16T;
C: specificity control; both primers with total RNA from S.
cerevisiae.
For further details see text.

Figure 7.

Schematic representation of the construction for the
inulinase promoter/signal sequence link to α-galactosidase
with the help of PCR generated fragments. The beginning of
the mature protein and the first amino acid of the
pre-protein are indicated by arrows. Digestion of plasmids
such as, for example, pSK1 with EcoRI and EagI removes the
DNA part comprising the GAL7 promoter and the SUC2 signal
sequence in such a way that an in frame fusion of inulinase
signal sequences with the α-galactosidase gene was directly
possible. For the in frame fusion of the whole
prepro-sequence to α-galactosidase an oligonucleotide
complementary to the coding strand was used as PCR primer,
said oligonucleotide comprising the recognition site of
BspMI. After PCR, digestion of the product with EcoRI and
BspMI created sticky ends that were compatible with the
ends of the original vector, for example pSK1. By changing
the hybridizing part of the PCR primer a similar fragment
was obtained for the in frame connection of the
pre-sequence to α-galactosidase.

Figure 8.

Schematic representation of the construction routes of
plasmids suitable for the expression of a heterologous
gene, here for example α-galactosidase, within K. lactis.
The construction route is only given for an episomal
plasmid based expression system. The sequence of the
α-galactosidase gene is indicated in black, shadowed areas
indicate yeast sequences, solid lines are bacterial
sequences; the direction of transcription is indicated.

Figure 9.

Schematic representation of the construction route of
plasmids pUR2431 and pUR2432, examples of construction
intermediates for the expression of α-galactosidase from
the INU promoter, containing either the intact prepro-
signal sequence or only the pre- part of the inulinase
signal sequence. The promoter sequence is shaded, the empty
box indicates both versions of the signal sequence, and the
example of a heterologous gene, here α-galactosidase, is
given as a black box. The direction of transcription is
indicated by arrows.

Figure 10.

Schematic representation of the construction of plasmids
pUR2433, pUR2434, pUR2435 and pUR2436, all variants of
episomal expression plasmids for the expression of
α-galactosidase in Kluyveromyces strains having leu2
determined auxotrophy.

Figure 11.

Schematic representation of the construction of pUR2437 and
pUR2438, vectors for integrating multiple copies of a
homologous or heterologous gene, such as α-galactosidase,
into the rDNA of K. marxianus. The overall structure of one
rDNA unit as well as the 3.5 kbp EcoRI fragment actually
used are drawn schematically.

Figure 12.

Sequence of the K. marxianus URA3 gene (corresponding to
Seq. ID. No. 11) and its deduced amino acid sequence
(corresponding to Seq. ID. No. 12).

Figure 13.

The construction of plasmid pKMU2, which was used for the
construction of a food-grade K. marxianus leu2 mutant. (A)
Plasmid pKML1 contains a K. marxianus LEU2 gene on a 5 kb
EcoRI fragment. (B) Plasmid pKMU2, where the intact LEU2
gene is replaced by a leu2::URA3 disruption. The small
boxes indicate the URA3 promoter and terminator regions.

Figure 14.

Structure of plasmid pKUR2431 (A) and the chromosomal
organization of the INU1 locus after integration of the
plasmid at the XhoI site (B).

Figure 15.

Structure of plasmids pUR2439 and pUR2440, which contain K.
marxianus DNA 5' to the previously cloned sequences. These
plasmids are based on pBluescript (Stratagene) and the
inserted K. marxianus DNA is shown as the dark shaded
boxes.

Figure 16.

The sequence of the 479 bp EcoRI fragment 5' to the
previously cloned sequences. The sequence (corresponding to
Seq. ID. No. 13) begins with the 5' EcoRI site from the
previously cloned INU1 DNA.

Figure 17.

Sequence (corresponding to Seq. ID. No. 14) of primer INUT
used for sequencing across the EcoRI site 5' to the
inulinase gene.

Figure 18.

Structure of plasmid pUR2445, which contains the 470 bp
EcoRI fragment from pUR2440 5' to the INU1 sequences in
pUR2434. The location and orientation of the approximately
470 bp EcoRI fragment is indicated.

Experimental

The following experimental section is offered by way of
example and should not be considered a limitation of the
scope of the invention.

Molecular biological procedures

Mostly standard methods were used as described in Sambrook,
J., Fritsch, E.F., & Maniatis, T., 1989. Molecular Cloning:
A Laboratory Manual. Second edition. Cold Spring Harbor
Laboratory Press. Any modifications used are described
below.

Strains, plasmids and growth conditions

E. coli strain JM109 (endA1, recA1, gyrA96, thi, hsdR17
(rK-, mK+), relA1, supE44; Yanisch-Perron et al., 1985) was
used for amplification of plasmids. Transformation of JM109
was carried out according to Cohen et al., 1973.

K. marxianus var. marxianus CBS 6556 (= ATCC 26548) was
obtained from the Yeast Division of the Centraalbureau voor
Schimmelcultures, Delft, The Netherlands, and maintained on
YEPD agar (1% yeast extract, 2% peptone, 2% glucose, 2%
agar) slopes.

Genomic DNA was isolated from a 200 ml YEPD overnight
culture and incubated with lyticase (Sigma Chemical
Company) from Arthrobacter luteus according to the
manufacturer.

Total DNA was isolated as described by Struhl et al.
(1979).

The K. marxianus strains were transformed with the plasmids
pUR2431 up to and including pUR2445. Transformation of the
Kluyveromyces strains was performed as described by Carter
et al., 1988. Transformants were recovered on selective
YNB-plates (0.67% YNB, 2% glucose, 2% agar) supplemented
with the essential amino acids (tryptophan 20 µg/ml or
leucine 20 µg/ml). The same liquid medium was used for
precultures, cultivated twice overnight at 30°C and diluted
1:10 in YP medium containing 5% sucrose (YPS) for
derepression of the INU promoter.
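
The medium composition above can be converted into weighed
amounts for any batch size. The following is a minimal sketch,
assuming that the percentages are weight per volume (g per 100
ml), which the text does not state explicitly; the batch
volume and function names are hypothetical.

```python
def grams_needed(percent_w_v: float, volume_ml: float) -> float:
    """Grams of a component for a w/v percentage (g per 100 ml)."""
    return percent_w_v / 100.0 * volume_ml


def milligrams_needed(ug_per_ml: float, volume_ml: float) -> float:
    """Milligrams of a supplement given in micrograms per ml."""
    return ug_per_ml * volume_ml / 1000.0


if __name__ == "__main__":
    batch_ml = 500.0  # hypothetical batch size
    # Selective YNB plates as described: 0.67% YNB, 2% glucose, 2% agar.
    for name, pct in [("YNB", 0.67), ("glucose", 2.0), ("agar", 2.0)]:
        print(name, grams_needed(pct, batch_ml), "g")
    # Amino acid supplement at 20 micrograms per ml.
    print("leucine", milligrams_needed(20.0, batch_ml), "mg")
```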



Example 1. Generation of DNA oligonucleotides 

To acquire a set of DNA oligonucleotides for PCR, a set of
mixed DNA oligonucleotides corresponding to the recently
determined N-terminal amino acid sequence of forms I and II
was synthesized [Rouwenhorst et al., 1990].
As a potential source for further DNA probes and to apply
the PCR technique, the amino acid sequences of two internal
CNBr fragments from the secreted inulinase, form I, and the
cell wall bound inulinase, form II, were determined.
Therefore the reduced and carboxyamidomethylated proteins
were subjected to overnight incubation in 70% formic acid
in the dark under N2. After addition of water the mixtures
were freeze dried and yielded the CNBr-digests of inulinase
forms I and II respectively.
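
The fragments expected from such a digest can be predicted
from a protein sequence, since CNBr cleaves on the
carboxy-terminal side of Met residues. The following is a
minimal sketch: it ignores incomplete or atypical cleavage
(such as the cleavage at a Trp residue reported further on),
and the test sequence is hypothetical, not the inulinase
sequence.

```python
def cnbr_fragments(protein: str) -> list[str]:
    """Split a one-letter protein sequence after every Met (M)."""
    fragments = []
    current = []
    for residue in protein.upper():
        current.append(residue)
        if residue == "M":
            # Cleavage occurs C-terminal to Met, so Met ends the fragment.
            fragments.append("".join(current))
            current = []
    if current:
        fragments.append("".join(current))
    return fragments


if __name__ == "__main__":
    print(cnbr_fragments("GAMKLMST"))  # hypothetical sequence
```
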
Separation of the obtained fragments was achieved by
reversed phase chromatography using a Bakerbond C4 wide
pore column (4.6 x 250 mm) mounted in a Waters HPLC. Elution
was achieved using a linear gradient of acetonitrile in
0.1% trifluoroacetic acid in water. Detection was carried
out at 214 and 254 nm.

The chromatograms obtained from digests of inulinase forms
I and II showed the presence of a number of poorly resolved
peaks; the overall pattern however was similar for both
digests and enabled collection of two fractions from both
runs, which were subjected to sequence analysis. The
outcomes of the runs are given in Fig. 1 (corresponding to
Seq. ID. No. 1 and 2).

No differences could be detected in the amino acid
sequences derived for the isolated fractions between forms
I and II. Peptide bond hydrolysis has occurred in one case
at a C-terminal Trp residue, which is rare but not
impossible [fragment 2 in Fig. 1/Seq. ID. No. 2] under the
given circumstances.

The nucleotide sequence was selected in such a way that
PCR could generate the genetic information of the
intervening sequence. From these sequence results sets of
mixed oligonucleotides were synthesized using an Applied
Biosystems 380A synthesizer. The sequence of these DNA
oligonucleotides is given in Fig. 2 (corresponding to Seq.
ID. No. 3 to 8).
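
Mixed oligonucleotides are needed because most amino acids
are encoded by more than one codon. The following sketch shows
one way a stretch of protein sequence could be ranked for
oligonucleotide design by counting the number of distinct
coding sequences; it uses the standard genetic code, and the
peptide and function names are hypothetical. It is not a
statement of how the oligonucleotides of Fig. 2 were actually
chosen.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that can encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


def least_degenerate_window(peptide: str, length: int) -> str:
    """Return the first window of the given length with minimal degeneracy."""
    if length < 1 or length > len(peptide):
        raise ValueError("window length must be between 1 and len(peptide)")
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    return min(windows, key=degeneracy)


if __name__ == "__main__":
    peptide = "LSRMWKDFA"  # hypothetical peptide
    best = least_degenerate_window(peptide, 5)
    print(best, degeneracy(best))
```

Stretches rich in Met and Trp give the smallest mixtures,
since each of these residues has a single codon.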

Example 2. Cloning the 5' coding region of inulinase 

With two of the obtained DNA oligonucleotide probes from
the N-terminal and internal protein sequence [KLM09 and
KLM06, respectively] PCR amplification on total K. marxianus
genomic DNA was carried out (Perkin Elmer Cetus DNA Thermal
Cycler) in 100 µl 10 mM Tris HCl, pH 8.3, 50 mM KCl, 1.5 mM
MgCl2, 0.001% gelatine, with 0.2 mM of each dNTP, 100 pmol
of the DNA oligonucleotides KLM06 and KLM09, approximately
0.5 µg of BamHI digested DNA and 1 U of Amplitaq polymerase.
Incubation parameters were set as follows: 32 cycles of
1 min at 95°C, 2 min at 50°C and 2.5 min at 72°C.
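
The cycling programme and primer amounts translate into a
total incubation time and a primer concentration as sketched
below. Several points are assumptions: the step labels
(denaturation, annealing, extension) are the conventional
reading of the three temperatures, the last temperature is
read as 72 degrees C, the reaction volume is read as 100
microlitres, the 100 pmol is taken to apply to each
oligonucleotide, and ramp times between steps are ignored.

```python
CYCLES = 32
# (label, minutes per cycle); labels are an interpretation, see above.
STEPS = [
    ("denaturation, 95 C", 1.0),
    ("annealing, 50 C", 2.0),
    ("extension, 72 C", 2.5),
]


def total_cycling_minutes(cycles: int, steps: list[tuple[str, float]]) -> float:
    """Total hold time over all cycles, excluding temperature ramps."""
    return cycles * sum(minutes for _, minutes in steps)


def primer_micromolar(pmol: float, volume_ul: float) -> float:
    """Concentration in micromolar; pmol per microlitre equals micromolar."""
    return pmol / volume_ul


if __name__ == "__main__":
    print(total_cycling_minutes(CYCLES, STEPS), "min")
    print(primer_micromolar(100.0, 100.0), "uM per primer")
```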

The reaction formed a specific 290-bp fragment, which was
subcloned into the SmaI site of the E. coli plasmid pTZ18R
(Mead et al., 1986) and introduced into E. coli JM109 by the
transformation protocol described by Chung et al. (1989).
One of the positive clones, designated pUR2415, was further
characterized; the DNA was isolated and purified according
to the Qiagen (Qiagen Inc., Chatsworth, California) protocol
and subsequently characterized by DNA sequencing. The
sequence of the PCR clone is given in Fig. 3 (corresponding
to Seq. ID. No. 15).

Comparison of the gained DNA sequence with known sequence
data of invertase from S. cerevisiae further confirmed the
authentic origin of the PCR product, since it displayed a
significant sectional homology with the supposed N-terminus
of invertase as described by Rouwenhorst et al. (1990).

Example 3. Restriction map of the DNA around the
inulinase gene

The NcoI-BamHI fragment of pUR2415 was further used as a
32P labelled probe for the construction of a physical map
of the DNA region around the 5' coding sequence of the
inulinase gene. Therefore chromosomal DNA of K. marxianus
was digested with several restriction enzymes separately
and in combination. After electrophoresis of DNA fragments
the gel was placed for 15 min in 0.25 M HCl, 15 min in 0.4
M NaOH, 0.6 M NaCl and 15 min in 0.5 M Tris, 1.5 M NaCl.
The DNA was transferred onto Hybond N-filters (Amersham
International plc) by vacuum blotting (LKB 2016 Vacugene)
for 2 hours in 10xSSC (1.5 M NaCl, 0.15 M Na3citrate) and
finally UV crosslinked for 15 min.
By using KpnI double digestions the resulting signals were
arranged into a first physical map spanning about 25 Kbp
(given in Fig. 4).

Example 4. Cloning of the inulinase gene

Results of the chromosomal restriction analysis revealed
two positive overlapping DNA fragments of 2.0 Kbp for EcoRI
and 4.0 Kbp for PstI, respectively. To isolate clones
containing the inulinase promoter, the signal sequence and
the polyadenylation/termination sequences, both digested DNA
pools were subcloned into pTZ19. For this purpose about 8 µg
of chromosomal DNA was digested with EcoRI and PstI
separately and resolved by agarose gel electrophoresis. DNA
fragments of about 2.0 Kbp from the EcoRI digest and the
fragments between 3.5 and 4.5 Kbp from the PstI digest were
isolated from the gel and purified with the Geneclean II kit
(Bio 101 Inc). A small amount of both digestions was again
loaded onto an agarose gel, the bands transferred to Hybond
membrane, and hybridized with the 32P labelled PCR fragment
to verify the presence of the hybridizing band within the
isolated pool. Since both fractions contained the
corresponding DNA fragments, the isolated EcoRI and PstI DNA
fragment pools were ligated into the correspondingly
digested pTZ19 plasmids and transformed into E. coli JM109
by standard procedures. The colonies obtained were subjected
to colony hybridization after replica plating them onto
Hybond-N filters (Amersham International plc) and plasmid
amplification on LB plates containing 500 µg
chloramphenicol ml-1. After 8 hours incubation at 37°C each
filter was subjected to the following wash procedure: 5
min in 1.5 M NaCl, 5 min in 0.5 M NaOH and twice in 1.5 M
NaCl, 0.5 M Tris HCl. The DNA was finally fixed to the
filter by UV crosslinking.

Hybridisation was done in 50 mM Tris pH 7.4, 10 mM EDTA pH
7.0, 1 M NaCl, 0.5% SDS, 0.1% Na-pyrophosphate, 0.2% ficoll,
0.2% polyvinylpyrrolidone, 0.2% BSA and 0.01 mg denatured
salmon sperm DNA at 68°C. The added [α-32P]dCTP (Amersham
International plc; 370 MBq/ml; 110 TBq/mmol) labelled DNA
probe was prepared by using a Multiprime DNA labelling kit
from Amersham Corporation, purified by elution over a
Sephadex G-50 column in TES (10 mM Tris-HCl pH 7.4, 1 mM EDTA
pH 8.0, 0.2% SDS) and then denatured by incubation for 2
minutes at 100°C prior to use. After overnight incubation
the filters were washed 2x for 20 min in 2xSSC, 0.1% SDS,
0.1% Na-pyrophosphate and 2x for 20 min in 0.1xSSC, 0.1% SDS,
0.1% Na-pyrophosphate at 68°C. Positive clones were
detected after overnight exposure of the dried filters to
Kodak X-ray film.

To verify the specificity of the obtained spots, filters
containing presumably positive clones were washed for 30 min
in 2% SDS at 90°C and rehybridized with a DIG-labelled DNA
probe of the PCR fragment, using the DIG luminescent
detection kit from Boehringer Mannheim according to the
manufacturer's protocol.

Plasmid DNA was isolated from putative positive colonies,
digested with appropriate enzymes and analysed by Southern
hybridization as follows: after electrophoresis of the
digested DNA the gel was placed for 15 min in 0.25 M HCl,
15 min in 0.4 M NaOH, 0.6 M NaCl and 15 min in 0.5 M Tris,
1.5 M NaCl. The DNA was transferred onto Hybond-N filters
(Amersham International plc) by vacuum blotting (LKB 2016
Vacugene) for 2 hours in 10xSSC and finally UV crosslinked
for 15 min. In this experiment the probe was labelled
through random primed incorporation of DIG-UTP according to
the protocol of the manufacturer (DIG DNA labelling kit,
Boehringer Mannheim). After overnight hybridisation with
the denatured probe at 68°C (in 5x SSC, 0.1%
N-laurylsarcosine, 0.02% SDS, 1% blocking agent) filters were
washed as described in the manufacturer's instructions.
The fragments found in the plasmids which hybridized with
the probe were in good agreement with the physical map
given in Fig. 4. One of the pTZ19R plasmids containing a
2.0 Kbp EcoRI insert, designated pUR2421, and one pTZ19R
plasmid comprising a 4.0 Kbp PstI fragment, designated pUR2422,
were further utilized to determine the DNA sequence of the
total inulinase gene.

Example 6. DNA sequence analysis of the complete
inulinase gene

DNA sequencing was mainly done as described by Sanger et
al., 1977, and Hsiao et al., 1991, using the Sequenase
version 2.0 kit from United States Biochemical Company,
according to the protocol with T7 DNA polymerase (Amersham
International plc) and [α-35S]dATP (Amersham International
plc: 370 MBq/ml; 22 TBq/mmol).

The complete sequence was determined from the recombinant
plasmids pUR2421 and pUR2422 by subcloning
fragments and by a primer walking strategy. Both DNA
fragments showed the expected overlap. In summary, 3223 bp
were sequenced on both DNA strands, including the promoter
region extending over about 0.75 Kbp upstream of the
putative start codon and a sequence of about 0.83 Kbp
behind the putative stop codon, including the putative
polyadenylation site and termination regions. The sequence
is given in Fig. 5 (corresponding to Seq. ID. No. 9 and
10). Sequence comparison of the coding part of the
inulinase gene of the present invention showed about 98%
homology with the very recently published inulinase coding
sequence of K. marxianus ATCC 12424 (Laloux et al., 1991).
The homology is less striking for the 50 nucleotides before
the prepro-sequence given in the same publication,
corresponding to about one third of the leader sequence
before the start codon from pUR2421. This variation is
probably due to strain differences.

At the amino acid sequence level, invertase and inulinase
display a homology of 69%, which is even higher
than the homology between the invertases from S. cerevisiae and
Schwanniomyces occidentalis. Therefore, both enzymes should
be treated as variations of the same enzymatic activity
rather than as different enzymes.

Since the N-terminus of secreted mature inulinase was
identified by protein sequencing, it was easy to
distinguish the coding part of the precursor protein,
which has a deduced 23 amino acid sequence displaying some
characteristic prepro-features. 270 bp in front of the
supposed ATG start codon a TATAAA box was identified,
indicating the presence of a promoter element.
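
The kind of motif search described here can be sketched in Python as follows. The snippet reports where a motif lies relative to the start codon, assuming the usual numbering in which the A of the ATG is +1 and the base immediately upstream is -1. The toy sequence is a hypothetical placeholder built so that the box lands at -276 to -271; it is not the sequence of Fig. 5, and the function names are illustrative only.

```python
def relative_position(index, atg_index):
    """Convert a 0-based string index to a position relative to the ATG.

    Assumed convention: the A of the ATG is +1, the base before it is -1,
    and there is no position 0.
    """
    offset = index - atg_index
    return offset + 1 if offset >= 0 else offset


def motif_positions(seq, motif, atg_index):
    """Return (start, end) positions of every occurrence of motif in seq."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        end = start + len(motif) - 1
        hits.append((relative_position(start, atg_index),
                     relative_position(end, atg_index)))
        start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical 300 bp upstream region followed by a start codon.
    toy = "G" * 24 + "TATAAA" + "C" * 270 + "ATGAAA"
    print(motif_positions(toy, "TATAAA", toy.find("ATG")))  # [(-276, -271)]
```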

Example 7. Determination of transcription start

To test the functionality of the detected promoter
structure and to identify the transcription start points of
the cloned inulinase fragment, primer extension experiments
were carried out. For the primer extension assay two DNA
oligonucleotides were used, complementary to nucleotides
-98 to -84 and 18 to 48 of the DNA sequence given in
Fig. 5 (which corresponds to Seq. ID. No. 9).

primer p21T 5'- AGC ACT GAC TCC TGC CAA TGG AAG CAA GAG
(Seq. ID. No. 17)

primer p16T 5'- TCT CTA TGG CAT AGA GA (Seq. ID. No. 18)
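
Because primer extension primers anneal to the mRNA, each primer is the reverse complement of the sense-strand stretch it covers. A minimal Python sketch of that conversion is given below, using the two primer sequences as printed above and keyed by their Seq. ID. numbers. The helper name is an illustrative assumption.

```python
# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "Seq. ID. No. 17": "AGC ACT GAC TCC TGC CAA TGG AAG CAA GAG",
    "Seq. ID. No. 18": "TCT CTA TGG CAT AGA GA",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    cleaned = "".join(seq.upper().split())
    return cleaned.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        target = reverse_complement(primer)
        print(f"{name}: anneals to 5'-{target}-3' ({len(target)} nt)")
```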

To further confirm the specificity of the signals, two
different total RNA preparations were chosen for the reverse
transcription. To this end, K. marxianus cells
were grown as described in YPS medium (Rouwenhorst et al.,
1990) under non-repressive conditions. From a 100 ml
culture, cells were harvested and total RNA was isolated as
described (Koehrer et al., 1991). A reverse transcriptase
reaction was carried out in the presence of either
[α-32P]dCTP or [α-35S]dATP. For each experiment about 10
RNA and 100 pg of primer were dissolved in 40 µl
hybridisation buffer, containing 50 mM Tris-HCl pH 8.3, 75
mM KCl, 3 mM MgCl2, 10 mM DTT and 40 U RNase inhibitor
(Boehringer Mannheim). The mixture was incubated at 65°C
for 5 min and slowly cooled down to room temperature. 1 µl
[α-32P]dCTP (Amersham International plc: 370 MBq/ml; 110
TBq/mmol) or [α-35S]dATP, 1 (25 U) reverse transcriptase
(Biolabs), dATP, dGTP and dTTP to a final concentration of
0.1 mM and dCTP to a final concentration of 0.01 mM were added
to a final volume of 50 µl and incubated for 1 hour at
37°C (for reactions containing [α-35S]dATP the nucleotide
concentrations were 0.1 mM for dGTP, dTTP and dCTP and 0.01
mM for dATP). The mixture was precipitated with ethanol and
subsequently loaded onto a 5% polyacrylamide gel together
with the DNA sequence reactions generated from the same
primers (Fig. 6).

In all experiments, three dominant signals emerged,
coinciding in each case with positions -174, -170 and -167
in Fig. 5 (which corresponds with Seq. ID. No. 9). These nucleotides
are located about 100 nucleotides behind the TATAAA box
[position -276 to -271 in Fig. 5]. The results of this part
of the invention therefore place the start of
transcription for the inulinase gene within the region
TAATCAGCAATT, defining the length of the uncommonly long 5'
non-coding sequence as 174, 170 and 167 nucleotides.
Directly behind this transcription initiation region we
recognized a sequence [TAAATCCGGG, nucleotides -163 to -153
in Fig. 5 (which corresponds with Seq. ID. No. 9)] that
perfectly matches the MIG1 binding consensus sequence of the S.
cerevisiae SUC2, GAL4 and GAL1 genes [WWWWTSYGGGG] (Nehlin
et al., 1991). The MIG1 gene product is known to be
involved in glucose repression of the GAL genes and
directly controls SUC2 expression in S. cerevisiae (Nehlin
et al., 1990). In contrast to the presumably related SUC2
gene, where the MIG1 binding site is located at -446 to
-435, this sequence motif is closer to the start codon in K.
marxianus and more similar to the location of the MIG1 site
in the GAL1 and GAL4 promoters. But in contrast to the
regulation of the GAL gene family of S. cerevisiae, the
inulinase and invertase promoters seem to be
regulated solely by glucose repression, a fact that allows the
construction of a strong, non-repressible promoter by
exchanging the putative MIG1 DNA binding site in the
inulinase promoter.
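
A comparison of a candidate site with a degenerate consensus such as WWWWTSYGGGG can be sketched with the standard IUPAC nucleotide codes (W = A or T, S = C or G, Y = C or T). Note that the motif is printed above with 10 nucleotides, whereas the consensus has 11 symbols and the quoted coordinates (-163 to -153) span 11 positions. The 11-mer used in the example below, which carries one additional G, is therefore an assumption made purely for illustration and should be checked against Fig. 5. The function names are also illustrative.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "Y": "CT", "R": "AG",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def matches_consensus(site, consensus):
    """True if site has the consensus length and fits it at every position."""
    site, consensus = site.upper(), consensus.upper()
    if len(site) != len(consensus):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, consensus))


def find_sites(seq, consensus):
    """Return 0-based start indices of all windows matching the consensus."""
    n = len(consensus)
    return [i for i in range(len(seq) - n + 1)
            if matches_consensus(seq[i:i + n], consensus)]


if __name__ == "__main__":
    mig1 = "WWWWTSYGGGG"
    print(matches_consensus("TAAATCCGGG", mig1))   # 10-mer as printed
    print(matches_consensus("TAAATCCGGGG", mig1))  # assumed 11-mer
```

With these definitions the 10-mer as printed fails only on length, while the assumed 11-mer satisfies every position of the consensus.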

The functional importance of the sequence around the AUG
start codon during initiation of translation in eukaryotes
is still a point of discussion, and has led to the
formulation of a consensus sequence for S. cerevisiae
(Hamilton et al., 1987). The inulinase gene shows little
homology with this sequence, and even the formulation of a
preliminary Kluyveromyces ATG context consensus sequence
does not improve the homology significantly. Taking the
known high expression level of inulinase in the natural
host into account, the idea of improving protein expression
through adaptation of the AUG context in Kluyveromyces mRNA
seems less compelling on the basis of the present information.

Example 8. PCR of Kluyveromyces regulatory sequences

Sequence analysis of the prepro-sequence of the leader
region of the cloned inulinase promoter in pUR2421 revealed,
5 amino acids after a predicted recognition site for the
signal peptidase (G-V-S-A-↓-S-V-I) (Von Heijne, 1986), also
a putative processing site for a KEX2-like endoprotease
(..K-R-↓-..) (Fuller et al., 1988). To test the functional
importance of the prepro-sequence, DNA oligonucleotides
were synthesized, one creating the complete prepro-inulinase
sequence, and the second appropriate for the
direct, in-frame attachment of the DNA coding for the
inulinase signal peptide to a given coding sequence.

Two oligonucleotides containing the BspMI recognition site,
one with sequence information for new restriction sites and
the complementary sequence for the DNA from the putative
KEX2 protease site, and the second complementary to the
signal peptidase cleavage site, were utilized for the
generation of suitable promoter/signal sequence fragments
by PCR. For the assembly of promoter and secretion signal
fragments from the inulinase gene, PCR amplification of
this part of the inulinase gene was performed. The
primers were designed in such a way that perfect couplings
with mature alpha-galactosidase were obtained. Two versions
were made, one with the inulinase pre-pro signal sequence
and the other with only the inulinase putative pre signal
sequence (see Fig. 7). The following primers were used:

INP 01: 5'-GGAATTCTCAAACCGAAATG-3' (Seq. ID. No. 19)
INP 02: 5'-CCCAAGCTTACCTGCCATGGGCCCTCTTGTAATTGATAACTG-3'
(Seq. ID. No. 20)
INP 03: 5'-CCCAAGCTTACCTGCATGCGGCCGCACTGACTCCTGCCAATG-3'
(Seq. ID. No. 21)

Two PCR mixes of 100 µl were made, each containing 40 pg
pUR2421 cut with KpnI, 1.5 mM MgCl2, 1 U AmpliTaq polymerase
and buffers and NTPs as appropriate. One reaction mixture
contained 100 pmoles of INP 01 and INP 02, the other 100
pmoles of INP 01 and INP 03. 25 cycles were performed in a
Perkin Elmer Cetus DNA Thermal Cycler, each cycle 1:00 min at
95°C, 1:30 min at 48°C and 2:00 min at 72°C. Afterwards, the
reaction products were treated with proteinase K (Crowe et al., 1991)
before digestion with EcoRI and HindIII was performed (48 hours at
37°C). The fragments were then isolated from a gel and
ligated with pTZ19R digested with EcoRI and HindIII. The
resulting plasmids, pUR2427 and pUR2428 (INP 01/INP 02 and
INP 01/INP 03 PCR products respectively), were transformed
into E. coli, and several colonies for both constructions
were cultivated, the plasmids isolated and purified, and the
sequence confirmed by DNA sequence analysis.
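
The cloning sites built into the three primers can be located with a simple string search, sketched below in Python. The recognition sequences for EcoRI (GAATTC), HindIII (AAGCTT) and BspMI (ACCTGC) are taken from general knowledge of these enzymes rather than from the text, and only the strand as written is searched. The dictionary and function names are illustrative assumptions.

```python
# Recognition sequences (5' to 3') of the enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BspMI": "ACCTGC"}

# Primer sequences as printed above.
PRIMERS = {
    "INP 01": "GGAATTCTCAAACCGAAATG",
    "INP 02": "CCCAAGCTTACCTGCCATGGGCCCTCTTGTAATTGATAACTG",
    "INP 03": "CCCAAGCTTACCTGCATGCGGCCGCACTGACTCCTGCCAATG",
}


def locate_sites(primer, sites=SITES):
    """Map each enzyme found in the primer to the 0-based index of its site."""
    return {name: primer.find(motif)
            for name, motif in sites.items() if motif in primer}


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, locate_sites(primer))
```

In this sketch INP 01 carries the EcoRI site, while INP 02 and INP 03 each carry a HindIII site followed directly by a BspMI site, which is consistent with the EcoRI/HindIII cloning and the later EcoRI/BspMI excision.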

Example 9. Construction of K. lactis expression
plasmids

The E. coli - Kluyveromyces shuttle vector pSK1 is a pKD1
derivative (Chen et al., 1989, Bianchi et al., 1987) and
contains a unique EagI site at the junction between the S.
cerevisiae SUC2 (invertase) signal sequence and the
α-galactosidase gene. The vector also comprises the
K. lactis TRP1 gene as selectable marker in K. lactis and
the ampicillin resistance gene as selectable marker in E.
coli.

Digestion of said plasmids pUR2427 and pUR2428 with EcoRI
and BspMI produces two fragments, which could be easily
ligated with the EcoRI/EagI digested Kluyveromyces vector
pSK1, thereby only replacing the EcoRI-EagI fragment,
containing the GAL7 promoter and SUC2 signal sequence, with
the EcoRI/BspMI fragments from pUR2427 and pUR2428. This
results in two vectors, one with the DNA sequence encoding
the expected signal peptide (= pre-sequence) directly
linked to the α-galactosidase gene (pUR2429), and a second
with the DNA encoding the natural prepro-sequence linked in
frame to the α-galactosidase gene (pUR2430).

Said episomal plasmids could immediately be used to
transform the trp- mutant strain of K. lactis, for example
by electroporation (Bolen et al., 1990), in which strain
pSK1 derivatives are known to be stably maintained.
Expression and secretion of α-galactosidase could be
determined under induced and non-induced conditions by
known procedures (Verbakel, J., 1991).

By digestion of said plasmids with EcoRI and HindIII a
fragment comprising both the promoter and the DNA encoding
the leader sequence, as well as the α-galactosidase gene,
can be transferred into existing vectors for targeted
homologous recombination into the K. lactis rDNA (Bergkamp
et al., 1992). The potential of these constructs for
stable, multicopy integration into the genome has been
shown for different organisms, genes and auxotrophic
markers (Lopes, 1990; Verbakel, 1991; Giuseppin et al.,
1991). Substitution, for example, of the GAL7 promoter and
SUC2 signal sequence in the plasmids pMIRKGAL-TA1, 2 and 3
by said inulinase promoter prepro- or pro-sequences,
followed by transformation of the constructs into K. lactis
MSK110 (a, uraA, trp1::URA3), should give high and stable
expression and secretion of α-galactosidase.
DNA fragments comprising the whole given promoter sequence
or functional parts thereof, but without the prepro-sequence
coding part, can also be used for intracellular
overexpression of homologous and/or heterologous genes.
Example 10. Construction of K. marxianus episomal
plasmids

To evaluate the function of the obtained DNA sequences in
the natural host, the pTZ19R derivatives pUR2427 and
pUR2428 were digested with EcoRI and BspMI to release the
PCR fragments, which were then used to replace
the EcoRI/EagI fragment in pSY9, an E. coli/S. cerevisiae
shuttle vector containing a unique EagI site at the junction
between the S. cerevisiae SUC2 (invertase) signal sequence
and the α-galactosidase gene and an EcoRI site in front of
the GAL7 promoter; the vector also comprises a LEU2 gene
copy from S. cerevisiae and the ampicillin resistance gene and
the MB1 origin for maintenance and selection in E. coli (M.
Harmsen, unpublished). The two variant vectors, one with
the direct connection of the expected signal peptide to the
α-galactosidase gene (pUR2432), and the second with the
natural prepro-sequence linked in frame to the
α-galactosidase gene (pUR2431), are not able to replicate in
K. marxianus (Figure 9). Finally, to obtain the
Kluyveromyces episomal expression vectors, the naturally
occurring plasmid pKD1 was linearized with EcoRI and
ligated into the EcoRI site of pUR2431 and pUR2432, thereby
yielding 4 new plasmids, with the DNA encoding either the
pre- or the prepro-sequence and in each case both possible
orientations of the pKD1 vector backbone (pUR2433 to
pUR2436).

Example 11. Construction of K. marxianus integrating
plasmids

To obtain mitotically stable integration into the ribosomal
DNA locus of the K. marxianus genome, homologous rDNA
sequences were used to target the linearized integrating
vector to the rDNA locus. In addition to its use in S.
cerevisiae, this approach has been proven successful for
multicopy integration into the rDNA locus of K. lactis
using the vectors pMIRKM1 and pMIRKM2 (Bergkamp et al.,
1992). These plasmids include either the LEU2 or the LEU2d
gene of S. cerevisiae and homologous rDNA sequences from
K. marxianus. Since the heterologous LEU2d gene apparently
is not able to functionally complement the LEU2 gene
disruption in the K. lactis strain (Bergkamp et al.,
personal communication), only the vector with the intact LEU2
gene was used for further K. marxianus constructions. A
very recently cloned 3.5 kb EcoRI fragment of the K.
marxianus rDNA, containing the 3' end of the 17S rDNA, the
5.8S rDNA and the 5' end of the 26S rDNA gene, which has
not yet been completely sequenced (Bergkamp, personal
communication), was ligated
into the EcoRI sites of pUR2431 and pUR2432. After
transformation into E. coli the transformants were found to
contain only plasmids in which the inulinase-α-gal
expression cassette was joined to the 26S rDNA gene part
(pUR2437 comprising the prepro-sequence, and pUR2438
containing the pre-sequence).



Example 12. Expression and secretion in K. marxianus

The outcome of the complete construction process was 6
different expression plasmids for K. marxianus, with the
following characteristics:

pUR2433: episomal vector;
         INU promoter + prepro-sequence + α-gal in
         orientation I.
pUR2434: episomal vector;
         INU promoter + prepro-sequence + α-gal in
         orientation II.
pUR2435: episomal vector;
         INU promoter + pre-sequence + α-gal in
         orientation I.
pUR2436: episomal vector;
         INU promoter + pre-sequence + α-gal in
         orientation II.
pUR2437: integration vector;
         INU promoter + prepro-sequence + α-gal in
         orientation I.
pUR2438: integration vector;
         INU promoter + pre-sequence + α-gal in
         orientation I.



The four episomal plasmids and the two integration vectors
were (after linearization at the unique XbaI site)
transformed into the existing K. marxianus leu2 strain KMS1
by known procedures (Carter et al., 1988). In this strain,
the gene coding for β-isopropylmalate dehydrogenase was
inactivated through integration of a dominant selection
marker [G418R] under the control of the PGK promoter. The
resulting strain is leu-, NeoR. Some of the acquired clones
were grown overnight at 37°C in YNB medium, diluted 1:10
into 50 ml of YPS in a shake flask and grown for 48 hr at
37°C under non-repressing conditions. The expression level
of α-galactosidase was determined enzymatically, intra- and
extracellularly, by known procedures (Verbakel, 1991;
Giuseppin et al., 1991). The copy number was provisionally
estimated by Southern blot analysis.

The α-galactosidase expression assays confirmed the benefit
of homologous regulatory sequences for high level
expression in K. marxianus. Application of the inulinase
promoter DNA sequence increased the expression of
α-galactosidase up to 150-fold compared to experiments in
which the S. cerevisiae PGK or GAL7 promoters were used
(data not shown). Moreover, the natural connection of this
promoter to the corresponding signal sequence led to
nearly 100% secretion of the heterologous protein. Here,
the use of the complete precursor sequence, including the
pro-peptide sequence, appears to increase not only
the amount of secreted α-galactosidase, but also
the amount of protein produced. This finding is in some
conflict with the conclusion given by Fleer et al., 1991,
where the deletion of the pro-sequence of human serum
albumin (HSA) did not influence the capacity of
Kluyveromyces lactis to express and secrete rHSA. On the other
hand, both experiments show that the final proteolytic
removal of the pro-peptide from the mature product,
presumably by a KEX2-equivalent protease, is not a
rate-limiting step. Whether the pro-sequence plays an
appreciable role in secretion, translation or mRNA
stabilization cannot yet be decided. Some influence of the
orientation of the DNA cassette within the plasmid has been
found, at least for the episomal expression systems
(compare pUR2433 with pUR2434, and pUR2435 with pUR2436).
This effect might be related to undetected copy number
variations, plasmid stability effects, or transcriptional
interference with other transcription processes on the
plasmid in orientation I.

transformant    OD 550    α-gal     secreted    copy
                          [mg/L]    [%]         no.

pUR2433-1       10.7       50.4     99.7        20
       -2       10.0       35.0     99.6        20
pUR2434-1        9.5      114.2     99.7        15
       -2       11.7      149.6     99.7        20
pUR2435-1       10.5       28.8     90.5        20
       -2       10.1       24.7     91.5        20
pUR2436-1       11.1       82.7     83.1        20
       -2       10.6       58.4     84.3        10
pUR2437-1        9.9        0.23    83.6         1
       -2       11.4        0.21    90.5         1
       -3       10.5        0.14    71.4         1
pUR2438-1       11.0        0.09    77.8         1
       -2       11.0        0.12    75.0         1
       -3       10.5        0.14    85.7         1
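
To make the comparison between constructs easier to follow, the α-galactosidase values tabulated above can be averaged per plasmid. The Python sketch below does this with values transcribed from the table. With only two or three transformants per construct the means are a rough summary and not a statistical analysis, and the function names are illustrative assumptions.

```python
from statistics import mean

# α-galactosidase levels [mg/L] transcribed from the table above,
# one value per independent transformant.
ALPHA_GAL_MG_PER_L = {
    "pUR2433": [50.4, 35.0],         # episomal, prepro, orientation I
    "pUR2434": [114.2, 149.6],       # episomal, prepro, orientation II
    "pUR2435": [28.8, 24.7],         # episomal, pre, orientation I
    "pUR2436": [82.7, 58.4],         # episomal, pre, orientation II
    "pUR2437": [0.23, 0.21, 0.14],   # integrated, prepro
    "pUR2438": [0.09, 0.12, 0.14],   # integrated, pre
}


def mean_expression(data=ALPHA_GAL_MG_PER_L):
    """Return the mean α-galactosidase level per plasmid."""
    return {plasmid: mean(values) for plasmid, values in data.items()}


def fold_difference(data, numerator, denominator):
    """Ratio of the mean expression of two plasmids."""
    means = mean_expression(data)
    return means[numerator] / means[denominator]


if __name__ == "__main__":
    for plasmid, value in mean_expression().items():
        print(f"{plasmid}: {value:.2f} mg/L")
    ratio = fold_difference(ALPHA_GAL_MG_PER_L, "pUR2434", "pUR2433")
    print(f"orientation II vs I (prepro): {ratio:.1f}-fold")
```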



The expression from the integration vectors (pUR2437 and
pUR2438) was very low, a result which could be correlated
with the low copy number present in the cell. One possible
explanation for this effect is the presence of the intact
heterologous S. cerevisiae LEU2 marker gene on the
cassette, since its promoter might be strong enough to
supply the leu-deficient cell with sufficient gene product,
even in this single copy configuration.

By taking advantage of the very recently cloned and sequenced
LEU2 gene of K. marxianus (Bergkamp et al., 1991), one can
further enhance stability and expression of homologous and
heterologous genes in the described K. marxianus strain
KMS1 by replacing the LEU2 copy from S. cerevisiae in
pUR2437 and pUR2438 with the corresponding homologous LEU2
and LEU2d promoter-deficient gene copies from K. marxianus
on the integration cassette.

Moreover, the cloned LEU2 gene can be further used to
obtain a disruption/deletion leu2 auxotrophic mutant strain
without insertion of heterologous DNA.

Example 13. Construction of KMS3

For the construction of a strictly homologous K. marxianus
leu2 mutant the URA3 gene of strain CBS 6556 was isolated
first and subsequently utilized for the disruption of the
LEU2 locus.

The last step in the biosynthesis of pyrimidine is
catalysed by orotidine-5'-phosphate decarboxylase (EC
4.1.1.23), an enzyme which in S. cerevisiae is encoded by
the URA3 gene. The URA3 gene is one of the most commonly
used selection markers because of the availability of
counter selection for the marker (Boeke et al., 1984).
Yeast cells having an active URA3 gene are unable to grow
in medium containing 5-fluoro-orotic acid (5-FOA), while
ura3 mutants grow normally.

The URA3 gene of K. marxianus CBS 6556 was isolated by
screening a genomic K. marxianus DNA bank (Bergkamp et al.,
1991), inserted in the vector lambdaL47.1 (Loenen et al.,
1980), with a radioactively labelled S. cerevisiae URA3 DNA
fragment. Three phage clones which hybridized with the
URA3 probe were isolated by standard techniques.
Restriction analysis followed by Southern analysis in all
cases detected a 2.5 Kbp EcoRI/SphI fragment, which was
subcloned in pUC19, resulting in plasmid pKMU1.
The insert carried by this plasmid and several subclones
thereof were sequenced by the described methods; the DNA
sequence and the corresponding amino acid translation are
given in Figure 12 (corresponding to Seq. ID. No. 11 and
12). The determined DNA sequences showed 71% homology on
the DNA level and 81% homology on the amino acid level with
the corresponding S. cerevisiae URA3 gene and its product
(Rose et al., 1984).

For the construction of the food-grade leu2 mutant,
spontaneous ura- mutants were selected by plating K.
marxianus wt cells on 5-FOA plates. Out of 10^ cells, 4
uracil-requiring mutants were obtained; one of these, named
KMS2, was used for further construction work.
Plasmid pKMU2 was constructed by replacing part of the 5.1
kb EcoRI LEU2 fragment in plasmid pKML1 (Bergkamp et al.,
1991) with the K. marxianus URA3 gene as indicated in
Figure 13. A 1 kb StuI/EcoRV fragment containing a large
part of the coding sequence of the LEU2 gene was replaced
by a 2.5 kb EcoRI/SphI fragment containing the K. marxianus
URA3 gene, giving plasmid pKMU2. EcoRI and SphI sticky ends
were made blunt by use of T4 DNA polymerase in the presence
of all four dNTPs. The linear 6.5 kb EcoRI fragment
containing the leu2 gene disruption leu2::URA3 was then
used to transform KMS2 by electroporation as described by
Meilhoc et al. (1990), with selection on medium lacking uracil
to select for uracil prototrophy. Of the 75 transformants
obtained, 12 also displayed a leucine requirement, indicating
that in the other 63 transformants heterologous recombination
had occurred; these transformants were not investigated any
further. Southern analysis of 3 of the 12 leu-
transformants confirmed that in all cases the wild type copy
of the LEU2 gene had been replaced by the leu2::URA3 fragment.
One of the newly obtained leu2 transformants, designated
KMS3, was stable even during long term growth under
non-selective conditions. This non-reverting K. marxianus leu2
strain is suitable for overexpression of homologous or
heterologous proteins in, for example, the food industry.
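
The transformant counts reported above amount to simple bookkeeping, which the following Python sketch makes explicit. It takes the 75 uracil-prototrophic transformants and the 12 leucine-requiring ones from the text and derives the remainder and the fraction showing the leu2 phenotype. The function name and the returned keys are illustrative assumptions.

```python
def summarize_disruption(total_transformants, leu_requiring):
    """Split transformants into leu-requiring and other integration events."""
    if total_transformants <= 0:
        raise ValueError("total_transformants must be positive")
    if not 0 <= leu_requiring <= total_transformants:
        raise ValueError("leu_requiring must lie between 0 and the total")
    return {
        "leu_requiring": leu_requiring,
        "other": total_transformants - leu_requiring,
        "fraction_leu_requiring": leu_requiring / total_transformants,
    }


if __name__ == "__main__":
    result = summarize_disruption(75, 12)
    print(result["other"])                             # 63
    print(f"{result['fraction_leu_requiring']:.0%}")   # 16%
```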

Example 14. Single copy integration of the α-galactosidase
gene cassette into the INU1 locus

Multi-copy integration of an mRNA-producing promoter-gene
cassette into the rDNA locus generates an unusual DNA
arrangement consisting of sequences from the gene desired
for expression, which are transcribed by RNA polymerase II,
and the stable rRNA genes, transcribed by RNA polymerases I
and III. To test the potential influence of the surrounding
rDNA on the expression of α-galactosidase under the control
of the INU promoter, this combination was also tested at a
different locus. The cassette was therefore integrated into
the inulinase (INU1) locus through a single cross-over,
thereby recombining the INU promoter with the wild type 5'
upstream sequence.

For the construction of inulinase integration plasmids the
Cyamopsis tetragonoloba α-galactosidase cassette with the
described promoter fragment and the prepro-sequence of the
K. marxianus inulinase (INU1) gene, and the S. cerevisiae
PGK terminator, were combined in a plasmid incapable of
replication in K. marxianus. To achieve this, the 804 bp
long EcoRI/BspMI fragment of plasmid pUR2427 was ligated
into the EcoRI/EagI digested plasmid pSY9 (N. Harmsen,
unpublished). The resulting plasmid, pUR2431 (Figure 9), was
linearized with XhoI within the promoter sequence of the
INU1 gene prior to transformation into strain KMS3, thereby
preferentially targeting the integration to the
chromosomal INU1 locus. The expected integration event
creates a chromosomal situation in which the
α-galactosidase gene is placed under the regulation of the
INU1 promoter within the wt chromosomal 5' DNA context,
whereas the chromosomal INU1 gene is placed under the
control of the cloned INU1 promoter fragment used in the
fusion constructs (Figure 14).

The acquired transformants were analyzed for both
α-galactosidase and inulinase production. α-Galactosidase
was measured as described earlier, while inulinase was
measured as described by Rouwenhorst et al. (1988). Results
for 4 different transformants are summarized below; the
total α-galactosidase production and the extracellular
inulinase activity of pUR2431 transformants and of the
untransformed yeast strain KMS3 after growth for 24 h in YPS
medium are given.



transformant/    OD 660nm    α-galactosidase    inulinase
strain                       production         activity
                             (mg/l)             (U/ml)

IG1              7.8          0.02              150
IG2              8.0          0.02              189
IG3              7.2          0.2               233
IG4              8.7         18.3                13
KMS3             8.3                            151

The transformant designated IG4 showed a strikingly high
α-galactosidase production level compared to the other 3
transformants, concomitant with rather low inulinase
production. Southern analysis of all 4 transformants
revealed that only in transformant IG4 was plasmid pUR2431
integrated into the chromosomal inulinase locus in the
expected manner, while in all three other cases the
integration event occurred elsewhere in the genome (not
shown). Hence, the low inulinase production of this
transformant might be caused by the lack of a further
activating 5' DNA sequence element not present on the
utilized fusion construct, an interpretation in accordance
with the low expression results of the other transformants,
where the integration took place elsewhere and where the
α-galactosidase gene is not under the control of the INU1
promoter and further 5' upstream chromosomal inulinase
sequences.

Example 15. Cloning of further INU1 upstream sequences.

To obtain additional upstream sequences of the INU1
promoter, which may enhance the expression directed by the
described INU1 promoter, total chromosomal DNA of K.
marxianus was isolated as described by Struhl et al. (1979)
and digested with different restriction endonucleases, all
having a recognition sequence within the first 500 bp of
the cloned and sequenced DNA fragment containing the INU1
locus. The fragments were separated by electrophoresis and
the agarose gel subsequently subjected to Southern analysis.
For the identification of additional 5' sequences, plasmid
pUR2421 was digested with EcoRI and NcoI and the about 290
bp long DNA fragment containing the 5' sequence of the cloned
promoter was isolated and used for the synthesis of a
digoxigenin-labelled DNA fragment, which was subsequently
used as a DNA probe.

From the specifically hybridizing signals obtained, the
EagI digestion product with an apparent length of about 1.9
kb and the XhoI digestion product with a length of
approximately 1 kb were chosen for further cloning.
The EagI digested K. marxianus DNA was separated by
electrophoresis and fragments in the region of 1.9 kb were
purified from the gel by the procedure described earlier.
These fragments were ligated into EagI digested Bluescript
vector (Stratagene) and the products introduced by
transformation into E. coli JM109. The transformants were
screened by colony hybridization as described earlier
using the DIG-labelled DNA probe containing 5' sequences
from the INU1 promoter described above. A similar approach
was used to clone the XhoI fragment containing upstream
sequences from INU1, but in this case the vector was
digested with XhoI prior to ligation with XhoI digested K.
marxianus DNA of approximately 1 Kb in size.

Using these techniques plasmids pUR2440 and pUR2439 were
identified, which contain the approximately 1.9 kb EagI
fragment and the approximately 1.1 kb XhoI fragment of the
INU1 promoter, respectively (Figure 15). The DNA sequence of
an approximately 470 bp region immediately 5' to the
previously cloned INU1 sequences was determined by the
techniques described earlier and is shown in Figure 16
(corresponding to Seq. ID. No. 13). Plasmid pUR2440 was
deposited in the Centraalbureau voor Schimmelcultures,
Baarn, The Netherlands under accession number CBS 648.93.


Example 16. Construction of expression vectors containing
longer derivatives of the INU1 promoter.

To obtain an autonomously replicating vector carrying the
extended INU1 promoter sequence, plasmid pUR2434 was
partially digested with EcoRI, and the approximately 11.6 kb
linear fragment was isolated from a gel and dephosphorylated.
The approximately 470 bp EcoRI fragment from pUR2440 was
subsequently ligated into this vector. The ligation mix was
used to transform E. coli JM109 to ampicillin resistance
and the plasmid DNA from the resulting transformants was
analysed by digestion with NcoI to identify those which
contained the insert adjacent to the existing INU1 promoter
sequences in pUR2434. Sequencing across the relevant EcoRI
site in pUR2440 using primer INUT (Figure 17), which
hybridizes within the previously sequenced region of the
INU1 promoter, was carried out to determine the sequence of
the extended promoter. The orientations of the cloned
approximately 470 bp EcoRI fragments were confirmed by
sequencing using primer INUT. Two plasmids were identified
which carried the fragment in the correct orientation and 4
were found with the fragment in the opposite orientation.
A still longer upstream region can be cloned into pUR2434
by cloning the 1.4 kb fragment from pUR2440, produced by
partial digestion with EcoRI, into pUR2434 partially
digested with EcoRI as described above. The correct
orientation of the fragment can easily be determined by
digestion of the resulting plasmids with EagI, which will
release the approximately 1.9 kb fragment cloned in
pUR2440.
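
The orientation check by sequencing mentioned above can be
pictured with the following minimal Python sketch, which
asks whether a sequencing read runs into an insert as
written or into its reverse complement. The function
names, the 12-base comparison window and the
demonstration sequences are hypothetical; the real reads
obtained with primer INUT are not given in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def insert_orientation(read, insert, k=12):
    """Classify a read by which end of the insert it enters.

    'forward'      : the read contains the first k bases of the insert
    'reverse'      : the read contains the first k bases of its reverse
                     complement
    'undetermined' : neither or both (e.g. palindromic ends, short read)
    """
    read = read.upper()
    fwd_start = insert.upper()[:k]
    rev_start = reverse_complement(insert)[:k]
    has_fwd = fwd_start in read
    has_rev = rev_start in read
    if has_fwd and not has_rev:
        return "forward"
    if has_rev and not has_fwd:
        return "reverse"
    return "undetermined"


# Hypothetical insert and vector flank, for illustration only.
insert = "AAGCTTGGCACTGACCGTAAGGTTCCATAG"
flank = "TTTTGGGG"
print(insert_orientation(flank + insert, insert))
print(insert_orientation(flank + reverse_complement(insert), insert))
```

Which of the two outcomes corresponds to the "correct"
orientation depends on how the reference insert sequence
is written relative to the promoter, so that mapping is
left to the user.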

Plasmid pUR2445 (Figure 18), carrying the extended promoter,
and pUR2434 were introduced by electroporation (Bolen et
al., 1990) into K. marxianus strain KMS3 with selection for
leucine prototrophy. Representatives of the resulting
transformants were grown overnight in minimal medium at
37°C and diluted 1:10 into 10 ml of YEP, 5% sucrose induction
medium. These cultures were grown for a further 24 hours at
37°C and the α-galactosidase levels in the fermentation
medium were determined. The results are shown below:



TRANSFORMANT                      α-GALACTOSIDASE
STRAIN                            (UNITS)

KMS3 pUR2434 (1)        24         7
KMS3 pUR2434 (2)        22        10
KMS3 pUR2434 (3)        24        11
KMS3 pUR2445 (1)        27        24
KMS3 pUR2445 (2)        19        18
KMS3 pUR2445 (3)        27        21



From these results it appears that extension of the
promoter has a beneficial effect on enzyme production
levels.
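
As a rough worked check of this comparison, the following
Python sketch averages the α-galactosidase values per
plasmid. It assumes that the last figure in each row of
the table is the α-galactosidase value in units; with
only three transformants per plasmid the result is
indicative and not a statistical test.

```python
from statistics import mean

# Values transcribed from the results table, assuming the last column
# holds the alpha-galactosidase units for each transformant.
units = {
    "pUR2434": [7, 10, 11],   # original INU1 promoter
    "pUR2445": [24, 18, 21],  # extended INU1 promoter
}

means = {plasmid: mean(values) for plasmid, values in units.items()}
fold_increase = means["pUR2445"] / means["pUR2434"]

for plasmid, value in means.items():
    print(f"{plasmid}: mean {value:.1f} units")
print(f"fold increase: {fold_increase:.2f}")
```

Under that assumption the mean rises from about 9.3 units
for pUR2434 to 21 units for pUR2445, a 2.25-fold increase.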

Multi-copy rDNA integrative plasmids carrying a longer INU1
promoter can similarly be constructed using the 1.9 kb EagI
fragment from pUR2440. The integrative vector pUR2437 can
be partially digested with EcoRI and either the
approximately 470 bp EcoRI fragment from pUR2440 or the 1.4
kb fragment formed by partial digestion of pUR2440 with
EcoRI inserted upstream of the existing INU1 sequences. The
resulting ligation mix can be introduced into E. coli JM109
and plasmids containing inserts identified by digestion
with EcoRI. The orientation of the approximately 470 bp
fragment could be confirmed by sequencing and that of the
approximately 1.4 kb EcoRI fragment by digestion of the
plasmids with EagI, which should give a fragment of
approximately 1.9 kb in addition to those derived from the
vector. The plasmids so produced can be linearized by
digestion with XbaI and introduced by transformation into
K. marxianus strain KMS1 or KMS3 with selection for leucine
prototrophy.

Samples of Escherichia coli transformed with plasmids
pUR2421 and pUR2422 were deposited under the Budapest
Treaty at the Centraalbureau voor Schimmelcultures (CBS) in
Baarn, The Netherlands on 27 May 1992. They received
deposit numbers CBS 265.92 and CBS 266.92, respectively.
These plasmids were mentioned in the specification in
Examples 5 and 6 and in Example 8. Plasmid pUR2440 (Example
15) was also deposited at the CBS under accession number
CBS 648.93.

CLAIMS

1. A nucleic acid sequence derivable from a yeast and
comprising at least a regulatory region derivable from a
gene encoding a polypeptide having inulinase activity or a
modified sequence of said nucleic acid sequence also having
regulatory activity.

2. A nucleic acid sequence according to claim 1, wherein
said yeast belongs to the genus Kluyveromyces, preferably
belonging to the species Kluyveromyces marxianus, more
preferably said yeast being the strain K. marxianus var.
marxianus deposited at the CBS at Baarn, The Netherlands
under number CBS 6556.

3. A nucleic acid sequence according to any of the
previous claims, comprising at least one region selected
from the group consisting of a promoter, a termination
signal, and a sequence encoding a secretory signal
necessary for secreting a gene product from a yeast, the
latter preferably being derivable from the inulinase gene
of a Kluyveromyces yeast.

4. A nucleic acid sequence derivable from a yeast and
comprising at least one regulatory region derivable from a
gene encoding a polypeptide having inulinase activity
according to any of the previous claims, comprising at
least a part of the nucleic acid sequence of Figure 5 or an
equivalent nucleic acid sequence, preferably comprising at
least polynucleotide -737 to -1 of Figure 5 having promoter
activity or comprising at least polynucleotide 1 to 48 of
Figure 5 encoding the inulinase pre-sequence.

5. A recombinant nucleic acid sequence according to any
of the previous claims, wherein the regulatory region is
operably linked to DNA encoding a specific RNA sequence not
coding for a specific protein.

6. A vector comprising a nucleic acid sequence according
to any of the claims 1-4, said nucleic acid sequence being
operably linked to a structural gene, such as a gene
encoding a polypeptide having inulinase activity or
α-galactosidase activity, preferably a yeast vector.

7. A recombinant host cell comprising a nucleic acid
sequence according to any of the claims 1-5 or a vector
according to claim 6.

8. A process for producing a desired expression product,
wherein a host cell comprising a vector according to claim
6 is cultured under conditions enabling the structural gene
to be expressed and optionally the resulting desired
expression product is isolated.

9. A process for producing RNA, wherein a host cell
comprising a recombinant nucleic acid sequence according to
claim 5 is cultured under conditions enabling production of
the RNA, whereby preferably the resulting RNA influences
the formation of at least one metabolite or the resulting
RNA is a flavouring component.

10. Use of a part of a nucleic acid sequence according to
any of claims 1-4 as a probe or a primer, said part having a
length of at least 10 nucleotides.
Fig. 1. Internal sequences of inulinase after CNBr digestion

fragment 1 from Inulinase forms I and II

* *
M-i-v-I-D-Y-n-N-T-S-G-F-F-n-S-S-V-D-P-R-Q-r-A-V-A-V

fragment 2 from Inulinase forms I and II 
Fig. 2a. DNA oligonucleotides derived from amino acid
sequence analysis

a. DNA probes from the CNBr fragments 1 and 2

1:     M    V    I    D    Y    N    N    T
      ATG  GTT  ATT  GAT  TAC  AAC  AAC  ACT
             C    C    C    T    T    T    C
             G    A                        G
             A                             A
   3' TAC  CAK  TAK  CTG  ATG  TTG  TTG  TG   5'  [23]  KLM 06  K=G/T

2:     D    P    K    V    F    W
      GAT  CCA  AAG  GTT  TTC  TGG
        C    T    A    C    T
             G         G
             C         A
   3' CTG  GGK  AAT  CAK  AAG  ACC   5'  [18]  KLM 07  K=G/T
Fig. 2b. DNA probes from the N-terminal fragment

04  K=G/T
05  R=G/A
08  K=G/T
09  Y=T/C

        K    A    I    T    G    T    T    F
       AAG  GCT  ATT  ACT  GGT  ACT  ACT  TTC
         A    A    C    C    A    C    C    T
              C    A    A    C    A    A
              G         G    G    G    G

04  3' TTT  CGK  TAK  TGK  CCK  TGK  TGK  AAG  5'  [24]
05  3' TTT  CGR  TAR  TGR  CCR  TGR  TG        5'  [20]
08  5' AAG  GCK  ATK  ACK  GGK  ACK  ACK  TTT  3'  [24]
09  5' AAG  GCY  ATY  ACY  GGY  ACY  AC        3'  [20]
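
The degeneracy codes used in Fig. 2 (K=G/T, R=G/A, Y=T/C)
each stand for two alternative bases, so that a probe with
n degenerate positions represents 2^n distinct
oligonucleotides. The following Python sketch, given as an
illustration only, enumerates them for the 20-mer KLM 09;
the function name is hypothetical and only the three codes
appearing in Fig. 2 are handled.

```python
from itertools import product

# Only the ambiguity codes that appear in Fig. 2.
DEGENERATE = {"K": "GT", "R": "AG", "Y": "CT"}


def expand(oligo):
    """Return every unambiguous sequence represented by a degenerate oligo."""
    choices = [DEGENERATE.get(base, base) for base in oligo.upper()]
    return ["".join(bases) for bases in product(*choices)]


klm09 = "AAGGCYATYACYGGYACYAC"  # probe 09, 5' to 3'
variants = expand(klm09)
print(len(variants))  # five Y positions give 2**5 variants
print("AAGGCTATCACCGGCACCAC" in variants)
```

The second check uses the stretch of the PCR fragment in
pUR2415 shown in Figure 3 beneath the KLM 09 primer, which
is one of the 32 variants.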
Figure 3: Inulinase PCR fragment in pUR2415.

KLM 09: (Y=T/C)
           AAGGCYATYACYGGYACYAC
  1 ggtacccAAGGCTATCACCGGCACCACTTTCAGTTTGAACAGACCTTCTGTGCATTTCAC 60
            K  A  I  T  G  T  T  F  S  L  N  R  P  S  V  H  F  T
seq. S G D S K A I T G T T F S L N R P S V P F T

          NcoI
 61 TCCATCCCATGGTTGGATGAACGATCCAAATGGTTTGTGGTACGATGCCAAGGAAGAAGA 120
      P  S  H  G  W  M  N  D  P  N  G  L  W  Y  D  A  K  E  E  D
seq. P

121 CTGGCATTTGTACTACCAGTACAACCCAGCAGCCACGATCTGGGGTACTCCATTGTACTG 180
      W  H  L  Y  Y  Q  Y  N  P  A  A  T  I  W  G  T  P  L  Y  W

181 GGGTCACGCTGTTTCCAAGGATTTGACTTCTTGGACAGATTACGGTGCTTCCTTGGGCCC 240
      G  H  A  V  S  K  D  L  T  S  W  T  D  Y  G  A  S  L  G  P

241 AGGTTCCGACGACGCTGGTGCGTTCAGTGGTAGTATGGTCATAGACTACAACAACACggg 300
                                      TACCAKTAKCTGATGTTGTTGTG
                                      KLM 06 (K=G/T)
      G  S  D  D  A  G  A  F  S  G  S  M  V  I  D  Y  N  N
seq. M V I D Y N N T S

seq. = a.a. sequence determined with amino acid sequencing

Underlined amino acids are identical to Saccharomyces cerevisiae invertase.
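
The reading frame shown in Figure 3 can be checked with a
short translation routine. The Python sketch below is a
minimal illustration using the standard genetic code; the
function name is hypothetical, and the frame offset of 7
(skipping the lower-case linker bases ggtaccc) is read off
the first line of the figure.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# First line (positions 1-60) of the PCR fragment in Figure 3.
line_1 = "ggtacccAAGGCTATCACCGGCACCACTTTCAGTTTGAACAGACCTTCTGTGCATTTCAC"
print(translate(line_1[7:]))
```

The translated residues agree with the amino acids printed
beneath the first line of the figure; the final T of that
line is completed by the first base of the next line.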
Figure 4: Cleavage map.

Figure 5: Nucleotide sequence of the inulinase gene
(corresponding to Seq. ID. No. 9).

[Sequence listing numbered from -737 to 2486, with the
deduced amino acid sequence beneath the coding region; the
sequence is not legibly reproduced in this copy.]
Figure 6.

A: Hybridisation with primer p16

B: Hybridisation with primer p21
Figure 7.

a) Invertase signal sequence - α-galactosidase linkage:

       inv. ss | α-gal                       α-gal
        A   A   E   N        EagI           A   E   N
   5'  GCG GCC GAA AAC      ------>      G GCC GAA AAC
   3'  CGC CGG CTT TTG                         CTT TTG

b) Inulinase prepro-link

      signal pept. | mature inulinase
      S  A  S  V  I  N  Y  K  R  D  G  D  S  K  A  I  T  N
5'-TCAGTGCTTCAGTTATCAATTACAAGAGAGATGGTGACAGCAAGGCCATCACTAAC-3'
         3'-GTCAATAGTTAATGTTCTCcc
                                 GggTACCGTCCATTCGAACCC-5'

   PCR

      S  A  S  V  I  N  Y  K  R   ApaI NcoI  HindIII
5'-TCAGTGCTTCAGTTATCAATTACAAGAGGGCCCATGGCAGGTAAGCTTGGG
3'-AGTCACGAAGTCAATAGTTAATGTTCTCCCGGGTACCGTCCATTCGAACCC
                                        BspMI

   BspMI

      S  A  S  V  I  N  Y  K  R
5'-TCAGTGCTTCAGTTATCAATTACAAGAG
3'-AGTCACGAAGTCAATAGTTAATGTTCTCCCGG

c) Inulinase pre-link

     signal pept. | mat.
     L  L  L  P  L  A  G  V  S  A  S  V  I  N  Y  K  R  D  G
5'-CCTCTTGCTTCCATTGGCAGGAGTCAGTGCTTCAGTTATCAATTACAAGAGAGATGGT-3'
           3'-GTAACCGTCCTCAGTCACGc
                                  GgcGTACGTCCATTCGAACCC-5'

   PCR

     L  L  L  P  L  A  G  V  S  A   NotI SphI  HindIII
5'-CCTCTTGCTTCCATTGGCAGGAGTCAGTGCGGCCGCATGCAGGTAAGCTTGGG
3'-GGAGAACGAAGGTAACCGTCCTCAGTCACGCCGGCGTACGTCCATTCGAACCC
                                         BspMI

   BspMI (or NotI)

     L  L  L  P  L  A  G  V  S  A
5'-CCTCTTGCTTCCATTGGCAGGAGTCAGTGC
3'-GGAGAACGAAGGTAACCGTCCTCAGTCACGCCGG
Figure 8.

Partial cut with EcoRI, cut with EagI
Isolate 10.1 kb vector

Cut with EcoRI and BspMI
Isolate 804 bp fragment

Cut with EcoRI and BspMI
Isolate 783 bp fragment

Figure 9.

Cut with EcoRI and EagI
Isolate 6.1 kb vector

Cut with EcoRI and BspMI
Isolate 804 bp fragment

Ligate

Cut with EcoRI and BspMI
Isolate 783 bp fragment

Figure 11.

Linearize with EcoRI
Ligate

Kluyveromyces marxianus rDNA unit
Isolate 3.5 kb EcoRI fragment

Linearize with EcoRI
Ligate
Figure 12.

[Nucleotide sequence numbered from -491 to 921, with the
deduced amino acid sequence (numbered to 267) beneath the
coding region; the sequence is not legibly reproduced in
this copy.]

Figure 13

Figure 14

Figure 15.

[Restriction map; legible labels: XmnI, lacZ', EcoRI,
AhaII, XhoI, HindIII.]
Figure 16.

  1 GAATTCAGAG CACAAGAATC TGGAATCCCC AGCTCTAGCA AATGTCAGGT
 51 GGAGACGGAC ACACTTGTTA CCATTCGTTA CCAATTGTTA ACACTACTTT
101 GCCTTCAACC GTTCTTTCCC CTGTGGGAGT GCCTAGGCTT ACCTTGGGAT
151 TGCCTAGACG GAGACGCCTG CGAGCQATTC CTGAGTTATC CCGTTCCCGT
201 GGCTACCCCA AGTTTAATTC TAATTCCAAT TCCAATTCCA ATTCTGATTC
251 CAATTCTCAT TCCAGCTTGC AACTTGCAAC TTGCAACTTG CAGTGGTAAC
301 CCCATCTGTG CGCGCTCGTC TGTGCATTTT TCCCTTTTTT TTCCGGCAGC
351 GCCGCCGGCC CTTGTGTGCA TAATTTAGCG TTTTTCTGTT TGTATTTTTT
401 TTTTTTTTTT TTTGTGTTTG CTTCTCTTGT CGTTTTTGTG TCTGTTTACC
451 TTTTCGGAAT TGAGGCGTTT CTTTTTGGCG AATTC
Figure 17: Sequencing primer INUT

5' ACG CAC TAA TTG ACT CAA CC 3'
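
For orientation, basic properties of the 20-mer shown in
Figure 17 can be computed as in the following Python
sketch. The melting temperature uses the Wallace rule of
thumb (2 °C per A/T and 4 °C per G/C), which is only a
rough estimate for short oligonucleotides and is not a
value stated in the text; the function name is
hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def primer_summary(primer):
    """Length, G+C count, Wallace-rule Tm estimate and reverse complement
    of an unambiguous primer written 5' to 3' (spaces are ignored)."""
    seq = primer.replace(" ", "").upper()
    gc = seq.count("G") + seq.count("C")
    at = len(seq) - gc  # assumes the primer contains only A, C, G and T
    return {
        "length": len(seq),
        "gc_count": gc,
        "wallace_tm_celsius": 2 * at + 4 * gc,
        "reverse_complement": seq.translate(COMPLEMENT)[::-1],
    }


summary = primer_summary("ACG CAC TAA TTG ACT CAA CC")
for key, value in summary.items():
    print(f"{key}: {value}")
```

The reverse complement is the sequence to look for on the
opposite strand of a template when locating where the
primer anneals.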
Figure 18.